Here, we’re just setting a few options.

knitr::opts_chunk$set(
  warning = TRUE, # show warnings during codebook generation
  message = TRUE, # show messages during codebook generation
  error = TRUE, # do not interrupt codebook generation in case of errors,
                # usually better for debugging
  echo = TRUE  # show R code
)
ggplot2::theme_set(ggplot2::theme_bw())

Now, we’re preparing our data for the codebook.

library(codebook)
codebook_data <- codebook::bfi
# to import an SPSS file from the same folder uncomment and edit the line below
# codebook_data <- rio::import("mydata.sav")
# for Stata
# codebook_data <- rio::import("mydata.dta")
# for CSV
# codebook_data <- rio::import("mydata.csv")

# omit the following lines, if your missing values are already properly labelled
codebook_data <- detect_missing(codebook_data,
    only_labelled = TRUE,                # only labelled values are autodetected
                                         # as missing
    negative_values_are_missing = FALSE, # if TRUE, negative values are treated
                                         # as missing values
    ninety_nine_problems = TRUE          # 99/999 are missing values, if they
                                         # are more than 5 MAD from the median
)

# If you are not using formr, the codebook package needs to guess which items
# form a scale. detect_scales() finds item aggregates with names like this:
# scale = scale_1 + scale_2R + scale_3R
# Identifying these aggregates allows the codebook function to
# automatically compute reliabilities.
# However, it will not reverse items automatically.
# codebook_data <- detect_scales(codebook_data)

# Load the participant data to document. This replaces the example data
# assigned above.
codebook_data <- rio::import("C:/Users/Omgjk/OneDrive - Emory University/Work/Lopman/Codebooks/participant_factor.rds")
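
The naming convention described in the comments above can be illustrated with a small base-R sketch. The data frame `toy`, the item stem `extra`, the helper `reverse()`, and the 1-5 response format are all hypothetical. The reversal is done by hand because, as noted, items are not reversed automatically.

```r
# Hypothetical helper: reverse-score an item on a low..high response format.
reverse <- function(x, low = 1, high = 5) low + high - x

# Hypothetical three-item scale; items 2 and 3 are negatively keyed, so they
# are reversed by hand and carry an "R" suffix in their names.
toy <- data.frame(
  extra_1  = c(5, 4, 2),
  extra_2R = reverse(c(1, 2, 4)),
  extra_3R = reverse(c(2, 1, 5))
)

# The aggregate uses the shared item stem as its name, matching the pattern
# scale = scale_1 + scale_2R + scale_3R.
toy$extra <- rowMeans(toy[, c("extra_1", "extra_2R", "extra_3R")])

print(toy)
```

With columns named this way, `detect_scales(toy)` should be able to match `extra` to its three items, although that call is not run here.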

Create codebook

codebook(codebook_data)

Metadata

Description

Dataset name: codebook_data

The dataset has N=304 rows and 9 columns. 303 rows have no missing values on any column.
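
The counts in this description can be reproduced with base R. The sketch below uses a small hypothetical data frame `toy` rather than the real data, and it is not necessarily how the codebook package computes the sentence internally.

```r
# Hypothetical stand-in for the real data: 5 rows, 3 columns, one missing value.
toy <- data.frame(
  part_id = 1:5,
  age     = c(21, 36, 40, 52, 78),
  edu     = c("Master", NA, "Bachelor", "Master", "Doctoral")
)

n_rows     <- nrow(toy)
n_cols     <- ncol(toy)
# complete.cases() is TRUE for rows without any missing value.
n_complete <- sum(complete.cases(toy))

cat(sprintf(
  "The dataset has N=%d rows and %d columns. %d rows have no missing values on any column.\n",
  n_rows, n_cols, n_complete
))
```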

Metadata for search engines
  • Date published: 2020-08-24
  • Keywords: part_id, age, age_cat, gender, race, hispanic, edu, hh_str, state_res2

Variables

part_id

Distribution

Distribution of values for part_id

0 missing values.

Summary statistics

| name | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist | label |
|:--|:--|--:|--:|--:|--:|--:|--:|--:|:--|:--|
| part_id | numeric | 0 | 1 | 1 | 152 | 304 | 152.5 | 87.90146 | ▇▇▇▇▇ | NA |

age

Distribution

Distribution of values for age

0 missing values.

Summary statistics

| name | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist | label |
|:--|:--|--:|--:|--:|--:|--:|--:|--:|:--|:--|
| age | numeric | 0 | 1 | 21 | 36 | 78 | 39.47368 | 12.96196 | ▇▅▃▂▁ | NA |

age_cat

Distribution

Distribution of values for age_cat

0 missing values.

Summary statistics

| name | data_type | ordered | value_labels | n_missing | complete_rate | n_unique | top_counts | label |
|:--|:--|:--|:--|--:|--:|--:|:--|:--|
| age_cat | factor | FALSE | 1. 20-29, 2. 30-39, 3. 40-49, 4. 50-59, 5. 60+ | 0 | 1 | 5 | 20-: 90, 30-: 76, 40-: 60, 50-: 49 | NA |

gender

Distribution

Distribution of values for gender

0 missing values.

Summary statistics

| name | data_type | ordered | value_labels | n_missing | complete_rate | n_unique | top_counts | label |
|:--|:--|:--|:--|--:|--:|--:|:--|:--|
| gender | factor | FALSE | 1. Female, 2. Male, 3. Prefer not to answer | 0 | 1 | 3 | Fem: 184, Mal: 116, Pre: 4 | NA |

race

Distribution

Distribution of values for race

0 missing values.

Summary statistics

| name | data_type | ordered | value_labels | n_missing | complete_rate | n_unique | top_counts | label |
|:--|:--|:--|:--|--:|--:|--:|:--|:--|
| race | factor | FALSE | 1. Black, 2. White, 3. Asian, 4. Mixed, 5. Other | 0 | 1 | 5 | Whi: 174, Mix: 52, Asi: 48, Bla: 26 | NA |

hispanic

Distribution

Distribution of values for hispanic

0 missing values.

Summary statistics

| name | data_type | ordered | value_labels | n_missing | complete_rate | n_unique | top_counts | label |
|:--|:--|:--|:--|--:|--:|--:|:--|:--|
| hispanic | factor | FALSE | 1. None of these, 2. Yes | 0 | 1 | 2 | Non: 290, Yes: 14 | NA |

edu

Distribution

Distribution of values for edu

1 missing value.

Summary statistics

| name | data_type | ordered | value_labels | n_missing | complete_rate | n_unique | top_counts | label |
|:--|:--|:--|:--|--:|--:|--:|:--|:--|
| edu | factor | FALSE | 1. Associate degree in college (2-year), 2. Bachelor’s degree in college (4-year), 3. Doctoral degree or Professional degree (PhD, JD, MD), 4. High school graduate (high school diploma or equivalent including GED), 5. Master’s degree, 6. Some college but no degree | 1 | 0.9967105 | 6 | Mas: 146, Bac: 118, Doc: 22, Som: 7 | NA |
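
The complete_rate reported for edu follows from the row count and the number of missing values. A minimal check in base R, assuming complete_rate is defined as the share of non-missing values (the variable names here are illustrative):

```r
n_rows    <- 304  # rows in the dataset, from the description
n_missing <- 1    # missing values reported for edu

# Share of rows with a non-missing value for this variable.
complete_rate <- (n_rows - n_missing) / n_rows

print(signif(complete_rate, 7))  # 0.9967105
```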

hh_str

Distribution

Distribution of values for hh_str

0 missing values.

Summary statistics

| name | data_type | ordered | value_labels | n_missing | complete_rate | n_unique | top_counts | label |
|:--|:--|:--|:--|--:|--:|--:|:--|:--|
| hh_str | factor | FALSE | 1. Live alone, 2. Live with parent, 3. Other, 4. Roommate or sibling, 5. Spouse and children only, 6. Spouse only | 0 | 1 | 6 | Spo: 97, Spo: 76, Liv: 44, Roo: 39 | NA |

state_res2

Distribution

Distribution of values for state_res2

0 missing values.

Summary statistics

| name | data_type | ordered | value_labels | n_missing | complete_rate | n_unique | top_counts | label |
|:--|:--|:--|:--|--:|--:|--:|:--|:--|
| state_res2 | factor | FALSE | 1. Georgia, 2. Illinois, 3. Other, 4. Virginia | 0 | 1 | 4 | Geo: 148, Oth: 98, Vir: 30, Ill: 28 | NA |

Missingness report

Codebook table

JSON-LD metadata

The following JSON-LD can be found by search engines, if you share this codebook publicly on the web.

{
  "name": "codebook_data",
  "datePublished": "2020-08-24",
  "description": "The dataset has N=304 rows and 9 columns.\n303 rows have no missing values on any column.\n\n\n## Table of variables\nThis table contains variable names, labels, and number of missing values.\nSee the complete codebook for more.\n\n|name       |label | n_missing|\n|:----------|:-----|---------:|\n|part_id    |NA    |         0|\n|age        |NA    |         0|\n|age_cat    |NA    |         0|\n|gender     |NA    |         0|\n|race       |NA    |         0|\n|hispanic   |NA    |         0|\n|edu        |NA    |         1|\n|hh_str     |NA    |         0|\n|state_res2 |NA    |         0|\n\n### Note\nThis dataset was automatically described using the [codebook R package](https://rubenarslan.github.io/codebook/) (version 0.9.2).",
  "keywords": ["part_id", "age", "age_cat", "gender", "race", "hispanic", "edu", "hh_str", "state_res2"],
  "@context": "http://schema.org/",
  "@type": "Dataset",
  "variableMeasured": [
    {
      "name": "part_id",
      "@type": "propertyValue"
    },
    {
      "name": "age",
      "@type": "propertyValue"
    },
    {
      "name": "age_cat",
      "value": "1. 20-29,\n2. 30-39,\n3. 40-49,\n4. 50-59,\n5. 60+",
      "@type": "propertyValue"
    },
    {
      "name": "gender",
      "value": "1. Female,\n2. Male,\n3. Prefer not to answer",
      "@type": "propertyValue"
    },
    {
      "name": "race",
      "value": "1. Black,\n2. White,\n3. Asian,\n4. Mixed,\n5. Other",
      "@type": "propertyValue"
    },
    {
      "name": "hispanic",
      "value": "1. None of these,\n2. Yes",
      "@type": "propertyValue"
    },
    {
      "name": "edu",
      "value": "1. Associate degree in college (2-year),\n2. Bachelor's degree in college (4-year),\n3. Doctoral degree or Professional degree (PhD, JD, MD),\n4. High school graduate (high school diploma or equivalent including GED),\n5. Master's degree,\n6. Some college but no degree",
      "@type": "propertyValue"
    },
    {
      "name": "hh_str",
      "value": "1. Live alone,\n2. Live with parent,\n3. Other,\n4. Roommate or sibling,\n5. Spouse and children only,\n6. Spouse only",
      "@type": "propertyValue"
    },
    {
      "name": "state_res2",
      "value": "1. Georgia,\n2. Illinois,\n3. Other,\n4. Virginia",
      "@type": "propertyValue"
    }
  ]
}
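
Search engines read JSON-LD from a script element of type `application/ld+json` in the page's HTML. If the metadata ever has to be embedded by hand, for example in a separately maintained page, a sketch along these lines would do it. The string `json_ld` is a shortened, hypothetical stand-in for the full block above.

```r
# Shortened, hypothetical JSON-LD string standing in for the full metadata.
json_ld <- '{"@context": "http://schema.org/", "@type": "Dataset", "name": "codebook_data"}'

# Wrap the JSON-LD in the script element that search engines look for.
script_tag <- paste0(
  '<script type="application/ld+json">\n',
  json_ld,
  '\n</script>'
)

cat(script_tag, "\n")
```

The resulting string can be pasted into the head or body of the HTML page that hosts the codebook.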